CSCI E-25¶

Extracting Features From Images¶

Steve Elston¶

Feature extraction from images is a fundamental aspect of computer vision. Image features must be extracted for many computer vision methods. Just a few of these are:

  1. Object recognition.
  2. Stitching images together.
  3. Locating and identifying objects in images.
  4. Models of motion from a series of images.
  5. Stereo vision and depth estimation.

In these exercises we focus on classical methods for extracting features from images. In state-of-the-art practice, features are created using deep neural networks; we will address learning features with deep neural networks extensively later. Rest assured that some background in classical methods will help you understand and appreciate the deep learning methods.

To get started, execute the code in the cell below to import the packages you will need for the exercises.

In [53]:
import skimage 
from skimage import data
from skimage.filters.rank import equalize, entropy
import skimage.filters as skfilters
import skimage.feature as feature
import skimage.segmentation as segmentation
import skimage.morphology as morphology
import skimage.transform as transform
from skimage.transform import probabilistic_hough_line
from skimage.feature import hessian_matrix, hessian_matrix_eigvals
from skimage.feature import hog
import skimage.util as util
from skimage.color import rgb2gray
from skimage import exposure
from skimage.draw import line as draw_line
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

Preparing the Image¶

Regardless of the methods used, an image must be prepared for feature extraction. Often many steps of correcting and filtering the image are required. An important step is the adjustment of the spectrum of the image: we need the spectrum to be as wide as possible, a transformation called pre-whitening. The name comes from the fact that white light has all spectral components equally represented. We have already discussed one process for whitening the spectrum of an image, contrast improvement. An image with high contrast has a broad spectrum.

The cat image requires little preprocessing, beyond pre-whitening. Execute the code in the cell below to load and prepare the cat image.

In [54]:
def plot_grayscale(img, title, h=8):
    plt.figure(figsize=(h, h))
    _=plt.imshow(img, cmap=plt.get_cmap('gray'))
    _=plt.title(title)

cat_image = rgb2gray(data.cat())
plot_grayscale(cat_image, 'Image size = ' + str(cat_image.shape))

cat_grayscale_equalized = exposure.equalize_adapthist(cat_image)
plot_grayscale(cat_grayscale_equalized, 'Equalized image')

Important note! Unless otherwise stated, the cat_grayscale_equalized image should be used for subsequent exercises.

Edge Detection¶

Edges are a fundamental feature of an image. Edges are characterized by rapid changes in intensity and are therefore associated with high gradients. The rapid change in intensity levels also means that the spectrum around an edge will have high-frequency components.

We can take advantage of the broad spectrum at edge features to create an edge detector algorithm. Recall that a Gaussian filter acts as a low-pass filter, reducing the high-frequency components. If we take the difference of two Gaussian filters with different bandwidths we can detect edges.
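
As a minimal one-dimensional illustration of this idea (a sketch, separate from the exercises), the difference of two Gaussian smoothings of a step edge is near zero on the flat regions and responds strongly around the step:

In [ ]:
# 1-D sketch: the difference of Gaussians responds at a step edge
step_signal = np.concatenate((np.zeros(50), np.ones(50)))
smooth_wide = skfilters.gaussian(step_signal, sigma=2.0)    # strong smoothing
smooth_narrow = skfilters.gaussian(step_signal, sigma=0.5)  # light smoothing
dog = smooth_wide - smooth_narrow
# dog is near zero away from the step and swings positive then negative
# around index 50, marking the edge
plt.plot(dog)
_=plt.title('Difference of Gaussians response to a step edge')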

Exercise 3.1: To gain a bit of understanding from a simple case, you will create a simple one-dimensional example of an edge detector. In this case, you will apply a convolutional gradient operator to a square wave function that is first positive and then negative. Aficionados of obscure functions will notice this is a Haar basis function (Haar, 1909). You will find and display the changes in gradient of this function by the following steps:

  1. Create the 1-d convolution operator as a Numpy array: $[1.0,-2.0,1.0]$.
  2. Convolve the series with the operator using the numpy.convolve function. Make sure to set the mode='same' argument so the resulting series is the same length as the original.
  3. Plot the result with the plot_conv function provided.
In [55]:
def plot_conv(series, conv, span):
    x = list(range(series.shape[0]))
    plt.plot(x, series, label = 'Original series')
    plt.plot(x, conv, color = 'red', label = 'Convolution result')
    plt.legend()
    plt.xlabel('Index')
    plt.ylabel('Value')
    plt.title('Series of raw values and convolution')

series = np.concatenate((np.zeros((20,)), np.zeros((5,)) + 1.0, np.zeros((5,)) - 1.0, np.zeros((20,))))    
    
# Define the 1D convolution operator
conv_operator = np.array([1.0, -2.0, 1.0])

# Convolve the series with the operator
gradient = np.convolve(series, conv_operator, mode='same')

# Plot the result
plot_conv(series, gradient, span=(0, len(series))) 

Examine the plot and answer the following questions:

  1. Why are the 3-point convolution operator values correct in terms of sign and normalization?
  2. Does the magnitude of the resulting series correctly represent the gradient of the original series, and why?
    End of exercise.

Answers:

  1. The operator $[1.0, -2.0, 1.0]$ is a discrete second-difference operator. Its coefficients sum to zero, so the response is zero on constant or linearly changing regions, and the symmetric $+1$ weights on either side of the $-2$ give equal and opposite responses across a step, so the sign of the output indicates the direction of the change in gradient.

  2. The resulting series represents the change in gradient (the second derivative) rather than the gradient itself: the paired positive and negative spikes mark the points where the series changes slope, and their magnitudes are proportional to the size of the jumps in the original series.
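
A quick numeric check (a sketch, separate from the exercise) shows both behaviors: the response is zero on the flat runs and a paired $+1/-1$ appears where the slope changes.

In [ ]:
step = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
print(np.convolve(step, [1.0, -2.0, 1.0], mode='same'))
# [ 0.  0.  1. -1.  0. -1.] -- zero on the flat runs, +1/-1 at the step;
# the trailing -1 is an artifact of zero padding at the series boundary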

Exercise 3-2: You will now create and test an edge detector using the difference of Gaussian filtered images. Starting with the equalized gray-scale cat image, perform the following steps:

  1. Define a first vector of values of $\sigma_1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]$ and a second value of $\sigma_2 = 0.5$.
  2. For each value of the $\sigma_1$ compute an image which is the difference between the Gaussian filtered images computed with $\sigma_1$ and $\sigma_2$.
  3. Display each resulting image, with the value of $\sigma_1$ in the title.
In [56]:
# Define sigma values
sigma1_values = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
sigma2 = 0.5

# The sigma2 filtered image is the same in every case, so compute it once
gauss2 = skfilters.gaussian(cat_grayscale_equalized, sigma=sigma2)

# Compute difference of Gaussian filtered images for each sigma1 value
for sigma in sigma1_values:
    gauss1 = skfilters.gaussian(cat_grayscale_equalized, sigma=sigma)
    
    # Compute difference image
    DoG_image = gauss1 - gauss2
    
    # Display the resulting image
    plt.figure()
    plt.imshow(DoG_image, cmap='gray')
    plt.title(f'Difference of Gaussian (sigma1={sigma}, sigma2={sigma2})')
    plt.axis('off')
    plt.show()

Answer the following questions:

  1. How does the gradient measurement of the images change as the value of $\sigma_1$ changes and what does this tell you about the scale of features in the image?
  2. Given the bandwidths of these Gaussian filters compared to the filter with $\sigma = 0.5$, do the changes of gradient measured make sense and why?
  3. Does it appear that the difference of Gaussian filters has detected high gradients with different orientations on the image, and why?
    End of exercise.

Answers:

  1. We observe that the gradient measurement changes as $\sigma_1$ varies in the difference of Gaussian filter, highlighting finer details with smaller $\sigma_1$ and emphasizing larger structures with larger $\sigma_1$. This indicates that the filter is sensitive to the scale of features in the image.

  2. Yes, changes in gradient measurements with different sigma1 values make sense because higher sigma1 values correspond to broader bandwidths, capturing larger-scale features, while smaller sigma1 values highlight finer details.

  3. Yes, the difference of Gaussian filters detects high gradients with different orientations in the image. The Gaussian kernel is isotropic, so the difference responds to variations across different spatial frequencies regardless of direction, effectively capturing edges and gradients at any orientation.

Exercise 3-3: One approach to finding edges is to look for areas of maximum gradient in an image. The difference of Gaussian filters has a strong response to the large gradients in image intensity that define edges. In many cases a boolean indicator of an edge is required. To create such an edge indicator from the difference of Gaussian filtered images (equalized gray-scale) with $\sigma_1 = 1.0$ and $\sigma_2 = 0.5$, perform the following steps:

  1. Use the skimage.filters.threshold_otsu function to compute a threshold value for the gradient.
  2. Apply the threshold and create an image of boolean type.
  3. Display the boolean image.
In [57]:
s1=1.0
s2=0.5

# Compute difference of Gaussian filtered images
gauss1 = skfilters.gaussian(cat_grayscale_equalized, sigma=s1)
gauss2 = skfilters.gaussian(cat_grayscale_equalized, sigma=s2)
DoG_image = gauss1 - gauss2

# Compute threshold using Otsu's method
threshold_value = skfilters.threshold_otsu(DoG_image)

# Create boolean image based on threshold
boolean_image = DoG_image > threshold_value

# Display the boolean image
plt.imshow(boolean_image, cmap='gray')
plt.title('Boolean Edge Indicator')
plt.axis('off')
plt.show()

Examine the results and answer the following questions:

  1. Has the binarization process captured the edges at all possible orientations, and why?
  2. Based on your observation, do you think the edge features are sufficient to identify the animal in the image as a cat within reasonable certainty?
    End of exercise.

Answers:

  1. Yes, the binarization process has captured the edges at all possible orientations, because the difference of Gaussian (DoG) filter is sensitive to edges at any orientation in the image. Thresholding the DoG filtered image with Otsu's method converts these responses into a binary representation: regions with high gradients are marked as edges (value 1), while regions with low gradients are marked as non-edges (value 0).

  2. No, edge features alone are not sufficient to identify that the image is a cat. While edges provide important information about the shape and structure, they may not capture sufficient detail or context to reliably identify that the animal is a cat.

Now that you have a bit of experience with edge detection, we will explore a few of the many commonly used edge detection algorithms. As you proceed notice how different algorithms produce different results.

Sobel edge detector¶

The Sobel edge detector uses the gradients of the pixel values to find edges. Edges are characterized by high gradient magnitude.

Like many feature detectors, a thresholding process must be applied to the gradients found with the Sobel filter. The result is a binary image showing the detected edges. The higher the threshold value, the fewer edge features will be displayed. This is an example of nonmaximal suppression, a widely used method to filter features in computer vision.
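
To make the gradient computation concrete, here is a sketch of the Sobel kernels applied directly with scipy; the kernel values are the standard ones up to a normalization constant, which skimage.filters.sobel handles for you along with boundary details:

In [ ]:
from scipy import ndimage

# Sobel kernels for gradients along rows and columns (unnormalized)
sobel_r = np.array([[ 1.0,  2.0,  1.0],
                    [ 0.0,  0.0,  0.0],
                    [-1.0, -2.0, -1.0]])
sobel_c = sobel_r.T
grad_r = ndimage.convolve(cat_grayscale_equalized, sobel_r)
grad_c = ndimage.convolve(cat_grayscale_equalized, sobel_c)
edge_norm = np.hypot(grad_r, grad_c)  # the gradient norm used in the exercise below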

Exercise 3-4: The Sobel filter is another form of edge detector. You will now apply the skimage.filters.sobel function to the equalized gray-scale cat image. In this case you will simply compute the norm and not the directions. Then, display the result.

In [58]:
# Apply the Sobel filter to compute edge norm
cat_sobel = skfilters.sobel(cat_grayscale_equalized)

plt.imshow(cat_sobel, cmap='gray')
plt.title('Sobel Edge Norm')
plt.axis('off')
plt.show()
  2. To examine the distribution of the Sobel filter output, execute the code in the cell below.
In [59]:
def plot_gray_scale_distribution(img):
    '''Function plots the histogram of a gray scale image along 
    with the cumulative distribution'''
    fig, ax = plt.subplots(1,2, figsize=(12, 5))
    ax[0].hist(img.flatten(), bins=50, density=True, alpha=0.3)
    ax[0].set_title('Histogram of image')
    ax[0].set_xlabel('Pixel value')
    ax[0].set_ylabel('Density')
    ax[1].hist(img.flatten(), bins=50, density=True, cumulative=True, histtype='step')
    ax[1].set_title('Cumulative distribution of image')  
    ax[1].set_xlabel('Pixel value')
    ax[1].set_ylabel('Cumulative density') 
    plt.show()

plot_gray_scale_distribution(cat_sobel)
  3. Next, compute the binary image using the skimage.filters.threshold_otsu function to find a threshold. In a loop, iterate over the multipliers $[1.0, 2.0, 3.0]$. For each multiplier, apply the product of the multiplier and the threshold and display the result. Make sure you display the threshold value for each image. Hint: notice that you can binarize the image by taking pixels greater than or equal to, or less than or equal to, the threshold.
In [60]:
# Compute the threshold for the Sobel gradient image using Otsu's method
threshold_value = skfilters.threshold_otsu(cat_sobel)
print(f"Otsu's Threshold Value: {threshold_value}")

multipliers = [1.0, 2.0, 3.0]

for multiplier in multipliers:
    # Compute the threshold adjusted by the multiplier
    adjusted_threshold = multiplier * threshold_value
    
    # Create binary image by thresholding the Sobel gradient image
    binary_image = cat_sobel >= adjusted_threshold
    
    plt.imshow(binary_image, cmap='gray')
    plt.title(f'Binary Image (Multiplier: {multiplier}, Threshold: {adjusted_threshold})')
    plt.axis('off')
    plt.show()
Otsu's Threshold Value: 0.1301034891474938

Compare the edges found with the Sobel filter with those found by the difference of Gaussian filters. Answer the following questions:

  1. Which aspects of the edges found by the two methods are substantially similar?
  2. How do the edge features detected by the two methods differ, in particular in terms of texture, continuity, compactness, etc?
  3. How do the edges found with the Sobel filter change with the threshold value, and why does this result make sense?

Answers:

  1. Both Sobel and Difference of Gaussian (DoG) filters detect edges based on changes in gradient in the image and highlight regions with significant intensity variations, representing potential boundaries.

  2. By comparing both filters, we observe that DoG filters capture finer details in edges better than Sobel filters. Sobel filters produce more continuous edges than DoG filters, which capture fragmented edges depending on the scale of features in the image. We also observe that Sobel filters generate more compact edges than DoG filters.

  3. Increasing the threshold value with the Sobel filter produces binary images with fewer edge pixels, whereas lowering the threshold value results in more edge pixels being detected, potentially including noise-induced edges.

Canny edge detector¶

As a point of comparison, run the code in the cell below to see the results of the Canny filter edge detector for various Gaussian smoothing parameters.

In [61]:
for sigma in [1.0,2.0,5.0]:
    cat_canny = feature.canny(cat_grayscale_equalized, sigma=sigma)
    plot_grayscale(cat_canny, 'Canny filter result, sigma = ' + str(sigma))

Compare the edges found with the Canny operator to those found with the Sobel algorithm. Notice that the edges found with Canny are thinner, particularly with larger smoothing parameters. Many parts of the cat's face, like the facial hair patterns, form closed or nearly closed shapes, outlining the edge of each pattern element.

Hough Transform¶

The family of Hough transforms are operators that enhance lines in images that are poorly defined. Commonly used versions of the Hough transform enhance straight lines and curved lines, including circles. Three-dimensional versions of the Hough transforms have been developed for surfaces.

Hough transforms are often used to enhance the features of images used for other computer vision algorithms. For example, enhancement of lines can improve the performance of deep learning algorithms.

We will focus on the Hough transform for straight lines. The Hough algorithm works by searching a space of possible lines. A simple search for all possible lines using the standard formulation of a straight line, $y = m\ x + b$, is computationally infeasible. Further, for vertical lines the slope is not defined, $m \rightarrow \infty$. An alternative formulation for a straight line is the Hesse normal form.

$$r = x \cos(\theta) + y \sin(\theta)$$
 

Where:
$r =$ the length of the vector from the origin to the closest point on the line.
$\theta = $ the angle between the $x$ axis and the vector from the origin to the closest point on the line.

The Hesse normal form has several advantages for this problem:

  • The search for lines can be performed by searching a grid of possible $r$ and $\theta$ values.
  • The Hesse normal form is valid for any line orientation.

The grid of possible $r$ and $\theta$ values can be regarded as generating a series of proposals for the presence of lines. The values of the proposals are stored in an accumulator. Points where there are peaks in the accumulator are the most likely parameters of a line, $r$ and $\theta$.
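
The voting step may be easier to see in a toy sketch. The helper below (hough_accumulator is our illustrative name, not part of skimage) implements the accumulator directly:

In [ ]:
def hough_accumulator(binary_img, n_theta=180):
    '''Toy straight-line Hough transform illustrating accumulator voting'''
    rows, cols = np.nonzero(binary_img)
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_theta, endpoint=False)
    r_max = int(np.ceil(np.hypot(*binary_img.shape)))
    accumulator = np.zeros((2 * r_max + 1, n_theta), dtype=int)
    for x, y in zip(cols, rows):
        # Each pixel votes for every (r, theta) line that could pass through it
        r = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        accumulator[r + r_max, np.arange(n_theta)] += 1
    return accumulator, thetas

Pixels lying on a common line vote for the same $(r, \theta)$ cell, so that cell accumulates a peak.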

To better understand the Hough transform, we will start with a synthetic example. The code in the cell below creates a simple image with three straight lines on a black background. Execute the code and examine the image.

In [62]:
three_lines = np.zeros((256,256))
three_lines[draw_line(132, 64, 196, 196)] = 1.0
three_lines[draw_line(96, 64, 64, 180)] = 1.0
three_lines[draw_line(64, 64, 175, 155)] = 1.0

fig, ax = plt.subplots(figsize=(6,6))
ax.imshow(three_lines, cmap=plt.get_cmap('gray'))
Out[62]:
<matplotlib.image.AxesImage at 0x7f9e7b7ff8e0>

We will now apply the basic straight line Hough transform to the simple image. The code in the cell below does the following:

  1. Uses skimage.transform.hough_line to search a range of test angles. The function returns the values in the accumulator, $h$, the angles, $\theta$, and the distances to the closest point on each line, $r$.
  2. The log of the accumulator values is displayed over the range of angles and distances.
  3. The lines found are superimposed on the original image by iterating over all angles and distances tested to find the maximum accumulator values using the transform.hough_line_peaks function. This function returns the accumulator value, angle and distance of the peaks detected. These peaks are estimates of the detected lines. Each line detected is displayed on the image.

Execute the code in the cell below to visualize the results.

In [63]:
test_angles = np.linspace(-np.pi / 2, np.pi / 2, 360, endpoint=False)
h, theta, r = transform.hough_line(three_lines, theta=test_angles)

fig, ax = plt.subplots(1,2, figsize=(12,5))
angle_step = 0.5 * np.diff(np.rad2deg(theta)).mean()
r_step = 0.5 * np.diff(r).mean()
bounds = [np.rad2deg(theta[0]) - angle_step, np.rad2deg(theta[-1]) + angle_step,
          r[-1] + r_step, r[0] - r_step]
ax[0].imshow(np.log(1 + h), extent=bounds, cmap=plt.get_cmap('gray'));
ax[0].set_title('Hough transform')
ax[0].set_xlabel('Angle in degrees')
ax[0].set_ylabel('Distance in pixels')

ax[1].imshow(three_lines, cmap=plt.get_cmap('gray'))
ax[1].set_title('Detected lines')
ax[1].set_xlim(0, 256)
ax[1].set_ylim(256, 0)  # keep image coordinates, origin at the top left

for _, angle, dist in zip(*transform.hough_line_peaks(h, theta, r)):
    (x0, y0) = dist * np.array([np.cos(angle), np.sin(angle)])
    ax[1].axline((x0, y0), slope=np.tan(angle + np.pi/2))

The image on the left shows the accumulator values displayed by distance and angle. Notice there are three distinct points in the $r,\ \theta$ space where the sinusoidal traces converge. These points are peaks in the accumulator, and therefore represent the parameters, in Hesse normal form, of the detected lines.

Let's explore some properties of the Hough transform through a real image example. We will use Hough transforms to enhance the lines on the retina image. To start, execute the code in the cell below to load and display the image.

In [64]:
retina_equalized = exposure.equalize_adapthist(rgb2gray(data.retina()))
plot_grayscale(retina_equalized, 'Retina image')

Exercise 3-5: You will now apply the Hough transform to the equalized retina image.

  1. Compute and display the Canny filtered image with the sigma argument set to 1.5.
  2. Display the gray scaled filtered image.
In [65]:
# Compute and display the Canny filtered image with sigma = 1.5
retina_canny = feature.canny(retina_equalized, sigma=1.5)
plt.imshow(retina_canny, cmap='gray')
plt.title('Canny Filtered Image (Sigma = 1.5)')
plt.axis('off')
plt.show()

# Display the grayscale filtered image
plt.imshow(retina_equalized, cmap='gray')
plt.title('Grayscale Filtered Image')
plt.axis('off')
plt.show()
  3. Complete the code in the cell below using the skimage.transform.probabilistic_hough_line function. A probabilistic Hough transform differs from the standard Hough transform by randomly sampling a subset of the image points, rather than accumulating votes from all of them. Use the threshold and line_length arguments from the tuple created by the iterator. Set the line_gap argument to 3.
In [66]:
def plot_lines(lines, ax):
    for line in lines:
        p0, p1 = line
        ax.plot((p0[0], p1[0]), (p0[1], p1[1]), color='blue')

_,ax = plt.subplots(3,2,figsize=(10,15))
ax=ax.flatten()

thresholds = [5,20,5,20,5,20]
lengths = [5,5,10,10,20,20]
for i, params in enumerate(zip(thresholds, lengths)):
    threshold, line_length = params
    line_gap = 3
    
    # Call probabilistic Hough line transform
    lines = transform.probabilistic_hough_line(retina_canny, threshold=threshold, line_length=line_length, line_gap=line_gap)
    plot_lines(lines, ax[i])
    ax[i].set_title(f'Threshold: {threshold}, Line Length: {line_length}')
    # Use image coordinates so the plotted lines are not vertically flipped
    ax[i].set_ylim(retina_canny.shape[0], 0)

plt.show()

Examine the plots and answer these questions:

  1. How do the detected lines in the image change with the line_length argument?
  2. How do the detected lines change with the threshold argument, and how does this effect compare to changes in line_length?

Answers:

  1. We observe that shorter line_length values result in the detection of shorter, potentially fragmented lines, while longer values detect longer, more continuous lines. As the line_length changes from 5 to 20, we observe more breaks or gaps in the image.

  2. Lowering the threshold increases the sensitivity of line detection leading to the detection of more lines, whereas higher thresholds result in more stringent line detection, emphasizing only more pronounced lines. The threshold parameter globally adjusts the sensitivity of line detection whereas the line_length parameter specifically controls the minimum length of detected line segments, impacting the continuity and completeness of the detected lines.

Corner Detection¶

Along with edges, corners are another fundamental feature of images. Detection of corners is fundamentally more difficult than detection of edges:

  1. Corners are 2-dimensional features, and require more than just a first order gradient for detection.
  2. Corners have an orientation based on the directions of the edges forming them.
  3. The angle forming the corner is a fundamental characteristic, and changing the angle changes the character of the corner.
    As a result of these fundamental characteristics, corner detectors must be based on metrics of multi-dimensional changes in intensity. For example, the Hessian is a $2 \times 2$ array of all possible second partial derivatives. This formulation allows us to determine both the presence and the orientation (direction) of corners.
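
Concretely, for image intensity $I(r, c)$ the Hessian collects these second partial derivatives:

$$H = \begin{bmatrix} H_{rr} & H_{rc} \\ H_{rc} & H_{cc} \end{bmatrix} = \begin{bmatrix} \frac{\partial^2 I}{\partial r^2} & \frac{\partial^2 I}{\partial r \partial c} \\ \frac{\partial^2 I}{\partial r \partial c} & \frac{\partial^2 I}{\partial c^2} \end{bmatrix}$$

Its eigenvalues measure the curvature of the intensity surface along two orthogonal directions, which is what allows both the presence and the orientation of a corner to be recovered. The eigenvalue exercises below use exactly this decomposition.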

Exercise 3-6: The Harris corner detector is one of the many widely used algorithms to detect corner features in images. The detector finds a set of x-y coordinates for each corner that meets a threshold of minimum distance. The threshold is used to perform nonmaximal suppression. Apply the skimage.feature.corner_harris function to the equalized gray-scale cat image by the following steps:

  1. Apply the Harris corner detector to the equalized cat gray-scale image.
  2. Iterate over values of the min_distance argument, $[1,3,5,9,12,20]$, and then perform the steps below.
  3. Use skimage.feature.corner_peaks to filter the corner features using the min_distance value.
  4. Print the minimum distance and the number of corner features after filtering.
  5. Side by side, plot the gray-scale and the gradient image computed with the difference of Gaussian filtered images with $\sigma_1 = 2.0$ and $\sigma_2 = 0.5$. Superimpose on both images the locations of the corners detected using a '+' marker.
In [67]:
sigma1 = 2.0
sigma2 = 0.5

min_distances = [1, 3, 5, 9, 12, 20]

# Compute the difference of Gaussian filtered images
DoG_image = skfilters.gaussian(cat_grayscale_equalized, sigma=sigma1) - skfilters.gaussian(cat_grayscale_equalized, sigma=sigma2)

# Apply the Harris corner detector once; only the peak filtering changes below
corners = feature.corner_harris(cat_grayscale_equalized)

for min_distance in min_distances:
    # Filter the corner features with nonmaximal suppression at min_distance
    corners_filtered = feature.corner_peaks(corners, min_distance=min_distance)
    print(f"Minimum Distance: {min_distance}, Number of Corner Features: {len(corners_filtered)}")

    plt.figure(figsize=(10, 5))

    # Plot the grayscale image
    plt.subplot(1, 2, 1)
    plt.imshow(cat_grayscale_equalized, cmap='gray')
    plt.title('Grayscale Image with Corners')
    plt.plot(corners_filtered[:, 1], corners_filtered[:, 0], 'r+', markersize=10)

    # Plot the gradient image with corners
    plt.subplot(1, 2, 2)
    plt.imshow(DoG_image, cmap='gray')
    plt.title('Gradient Image with Corners')
    plt.plot(corners_filtered[:, 1], corners_filtered[:, 0], 'r+', markersize=10)

    plt.show()
Minimum Distance: 1, Number of Corner Features: 3794
Minimum Distance: 3, Number of Corner Features: 1398
Minimum Distance: 5, Number of Corner Features: 604
Minimum Distance: 9, Number of Corner Features: 220
Minimum Distance: 12, Number of Corner Features: 123
Minimum Distance: 20, Number of Corner Features: 34

Examine the results and images and answer these questions:

  1. How can you describe the change in the number of corner features as the minimum distance is increased, and why?
  2. Do the features that remain as the minimum distance increases appear more robust (or obviously corners to the eye) given the gradient computed with the difference of Gaussians?
  3. Consider a trade-off between using a large set of features which may better represent a specific image vs. a sparser but more robust set of features. How might you decide how to optimally filter the features?

Answers:

  1. As the minimum distance is increased, the number of corner features decreases as increasing the minimum distance parameter imposes a stricter condition for corner detection. So, corners that are closer together are filtered out, resulting in fewer detected corner features in the image.

  2. Yes, as the minimum distance increases, the features that remain tend to appear more robust because a larger minimum distance parameter filters out weaker or less distinct corners, retaining only those features that exhibit stronger gradient changes.

  3. In order to decide how to optimally filter the features balancing feature richness and robustness, we have to experiment with parameters, employ cross-validation, leverage domain knowledge, and iterate.

Eigenvalues and Corner Detection¶

Now that you have worked with a corner detection algorithm, you will create your own using eigenvalues. This algorithm is a simplified version of the well-known SIFT algorithm. The algorithm uses the ratio of the eigenvalues to find corners with changes in gradient. The steps of this algorithm are:

  1. The Hessian for each pixel in the image is computed using the operator span specified, $\sigma$.
  2. The eigenvalues of the Hessians are computed.
  3. The ratio of the largest to the second eigenvalue is computed.
  4. The eigenvalue ratio is thresholded to reduce the number of features.
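
For reference, the eigenvalues in step 2 have a closed form for the symmetric $2 \times 2$ Hessian:

$$\lambda_{1,2} = \frac{H_{rr} + H_{cc}}{2} \pm \sqrt{\left( \frac{H_{rr} - H_{cc}}{2} \right)^2 + H_{rc}^2}$$

Along an edge one eigenvalue dominates, so the magnitude ratio $|\lambda_1| / |\lambda_2|$ is large, while at a corner the two eigenvalues are comparable. Thresholding on the eigenvalues therefore separates corner features from edges and flat regions.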

Exercise 3-7: You will now implement the corner detection and examine the results by the following steps:

  1. Compute the Hessian of the equalized gray-scale cat image using skimage.feature.hessian_matrix, iterated over values of $\sigma = [0.2,0.5,1.0,2.0,4.0]$.
  2. For each Hessian compute the eigenvalues using skimage.feature.hessian_matrix_eigvals.
  3. Compute the ratio of the magnitudes, or absolute values, of the first and second eigenvalues. These values can be found in the first (largest) and second elements of the list returned by the hessian_matrix_eigvals function.
  4. Apply a threshold of $0.01$ to the largest eigenvalue. If the largest eigenvalue is less than the threshold, set the eigenvalue ratio to $10^{-6}$.
  5. Display images for each of the 3 components of the Hessian, $[HRR, HRC, HCC]$, along with the image of the log of the thresholded eigenvalue ratio. Make sure to label all images with the operator span and each of the Hessian components, or log eigenvalue ratio.
In [68]:
sigma_list = [0.2,0.5,1.0,2.0,4.0]
threshold = 0.01

for sigma in sigma_list:
    # Compute Hessian matrix
    hessian = hessian_matrix(cat_grayscale_equalized, sigma=sigma, use_gaussian_derivatives=False)
    
    # Compute eigenvalues
    eigenvalues = hessian_matrix_eigvals(hessian)
    
    # Compute eigenvalue ratio
    eigenvalue_ratio = np.abs(eigenvalues[0]) / np.abs(eigenvalues[1])
    eigenvalue_ratio[eigenvalues[0] < threshold] = 1e-6
    
    plt.figure(figsize=(15, 5))
    
    # Display Hessian components
    plt.subplot(1, 4, 1)
    plt.imshow(hessian[0], cmap='gray')
    plt.title(f'Hessian HRR (Sigma = {sigma})')
    
    plt.subplot(1, 4, 2)
    plt.imshow(hessian[1], cmap='gray')
    plt.title(f'Hessian HRC (Sigma = {sigma})')
    
    plt.subplot(1, 4, 3)
    plt.imshow(hessian[2], cmap='gray')
    plt.title(f'Hessian HCC (Sigma = {sigma})')
    
    # Display log of thresholded eigenvalue ratio
    plt.subplot(1, 4, 4)
    plt.imshow(np.log(eigenvalue_ratio), cmap='gray')
    plt.title(f'Log Eigenvalue Ratio (Sigma = {sigma})')
    
    plt.show()

Answer the following questions:

  1. How can you describe the way the components of the Hessian change as the span of the operator, $\sigma$, changes, and why?
  2. What differences can you see between the components of the Hessian and why? Hint, pay careful attention to the orientations of the detected features.
  3. What differences do you see in the feature image as the span of the Hessian operator, $\sigma$, changes and why?
    End of exercise.

Answers:

  1. As the span of the operator (sigma) changes, the components of the Hessian matrix adapt accordingly. Higher sigma values lead to smoother representations of intensity variations across the image. Specifically, HRR captures variations along the row direction, HRC reflects correlations between intensity changes along the two axes, and HCC represents curvature variations along the column direction.

  2. We observe that HRR captures horizontal features, HRC detects diagonal features, and HCC highlights vertical features. These differences enable comprehensive detection of edges and corners at all orientations.

  3. As the span of the Hessian operator (σ) changes, we observe some scale variation and orientation sensitivity. As sigma increases, the image seems to capture larger-scale features while potentially missing finer details and vice versa. Higher sigma values also provide more robust detection of edges and corners.

Interest Point Descriptors¶

Many computer vision applications, including image stitching, stereo vision, and flow (motion tracking), require the unique identification of interest points or key points.

The HOG Algorithm¶

The orientation of corners can be used as part of a representation or feature map of an image. As with corner detection, a great many algorithms for determining corner orientation have been developed. Here, we will only work with the HOG, or histograms of oriented gradients, algorithm. The algorithm computes a histogram of the gradient directions over each patch, or cell, of the image.

The skimage.feature.hog function, called with visualize=True, returns a tuple. The first element is the HOG feature vector, the concatenated histogram values for all cells. The second element is a visualization image the same size as the input. In the visualization, the long axis of each marker shows the orientation, and the intensity of the marker indicates the magnitude of the gradient.
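
A quick shape check illustrates these return values (a sketch; the descriptor length depends on the image size and on the cell and block parameters, which here are skimage's defaults):

In [ ]:
# Inspect the two return values of hog for the equalized cat image
hog_descriptor, hog_image = hog(cat_grayscale_equalized, orientations=9,
                                pixels_per_cell=(8, 8), cells_per_block=(3, 3),
                                visualize=True)
print(hog_descriptor.shape)  # flattened histograms: 9 bins per cell, grouped into blocks
print(hog_image.shape)       # visualization image, the same shape as the input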

Exercise 3-8: You will now apply the HOG algorithm to the equalized gray-scale cat image. You will compare the results of using a different number of pixels per cell to estimate the histogram of 9 orientations by the following steps:

  1. Iterate over tuples for the pixels_per_cell argument with dimensions $[(4,4), (8,8), (16,16), (32,32)]$, and for each tuple do the rest of these steps.
  2. Compute the gradient orientations using the pixels_per_cell=pixels and visualize=True arguments.
  3. Print the pixels per cell tuple, with the dimensions of the operator and the number of HOGs returned by the function in the title of each image. The number of HOGs is in the first element of the list returned by the hog function.
  4. Display the gray-scale image of the corner orientations.
In [69]:
pixels_per_cell_tuples = [(4, 4), (8, 8), (16, 16), (32, 32)]

for pixels_per_cell in pixels_per_cell_tuples:
    # Compute the HOG descriptor and the visualization image
    hog_descriptor, hog_image = hog(cat_grayscale_equalized, pixels_per_cell=pixels_per_cell, visualize=True)
    plt.figure(figsize=(8, 6))
    plt.imshow(hog_image, cmap='gray')
    plt.title(f'Pixels per Cell: {pixels_per_cell}, HOG descriptor length: {len(hog_descriptor)}')
    plt.axis('off')
    plt.show()

Examine the results and answer the following questions:

  1. How has the density of the corner direction features changed as the pixels per cell dimensionality increases, and why?
  2. How does the magnitude of the gradient change as the pixels per cell and scale increase, and does this make sense given the algorithm and why?
  3. How do the smoothness and consistency of the gradient directions change as the pixels per cell dimensionality or scale increases, and does this make sense given the algorithm and why?
  4. As the size of the operator increases, you can see that the representation of the cat image changes, becoming more abstracted. If your goal is to robustly represent the object, what do you think the trade-off will be and why?
    End of exercise.

Answers:

  1. As the pixels per cell dimensionality increases, the density of corner direction features decreases due to the coarser gradient estimation, increased pixel sampling, and spatial aggregation associated with larger pixel per cell dimensions. We observe that fewer cells capture corner orientations, resulting in a reduction in the density of detected features.

  2. As the pixels per cell and scale increase, the magnitude of the gradient decreases due to the HOG algorithm's behavior reflecting spatial aggregation, smoothing effect, and decreased sensitivity to small-scale variations.

  3. As the pixels per cell dimensionality or scale increases, the gradient directions become smoother and more consistent. This makes sense given the HOG algorithm's spatial aggregation, smoothing effect, and decreased sensitivity to small-scale variations.

  4. As the size of the operator increases, the trade-off involves balancing robustness and specificity in cat image's representation. While larger operators enhance robustness and generalization, they sacrifice the ability to capture fine details and discriminate between similar objects.

The BRIEF Algorithm¶

As with edge and corner detectors, a great many interest point or key point detection algorithms have been developed. Here we will only work with one, the BRIEF algorithm. The BRIEF algorithm creates a binary hash from pairwise pixel intensity comparisons over patches of the image. These hashes can be matched between images to find common interest points.
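
For context, here is a sketch of how such hashes might be matched between two images; image_a and image_b are hypothetical stand-ins, and match_descriptors with the hamming metric is the usual pairing for binary descriptors:

In [ ]:
from skimage.feature import match_descriptors

def brief_keypoints(img):
    '''Detect Harris corner peaks and compute their BRIEF hashes (a sketch)'''
    keypoints = feature.corner_peaks(feature.corner_harris(img), min_distance=5)
    extractor = feature.BRIEF(patch_size=5)
    extractor.extract(img, keypoints)
    # Keypoints too close to the image border are dropped; .mask marks survivors
    return keypoints[extractor.mask], extractor.descriptors

# kp_a, desc_a = brief_keypoints(image_a)
# kp_b, desc_b = brief_keypoints(image_b)
# matches = match_descriptors(desc_a, desc_b, metric='hamming', cross_check=True)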

Exercise 3-9: You will now apply the skimage.feature.BRIEF function to the equalized gray-scale cat image by the following steps:

  1. Apply the Harris corner detection algorithm to the equalized gray-scale cat image.
  2. Find the Harris corner peaks with min_distance=5, and print the number of peaks found.
  3. Initialize the BRIEF feature extractor with patch_size=5.
  4. Extract the BRIEF descriptor hashes from the cat gray-scale image and the corner peaks found. The function returns an object with many attributes. The descriptor hash is the .descriptor attribute.
  5. Print the first 4 hashes, being careful to label them.
  6. Compute the Hamming similarity between the first hash and all subsequent hashes using numpy.logical_xor.
  7. Display a histogram of the Hamming distances using 30 bins and the range limited to the possible similarity values. Make sure you label the axes and provide a meaningful title for your chart.
In [70]:
# Apply Harris corner detection algorithm
harris_response = feature.corner_harris(cat_grayscale_equalized)

# Find Harris corner peaks
corner_coords = feature.corner_peaks(harris_response, min_distance=5)
num_peaks = corner_coords.shape[0]
print(f'Number of Harris corner peaks found: {num_peaks}')

# Initialize BRIEF feature extractor
brief_extractor = feature.BRIEF(patch_size=5)

# Extract BRIEF descriptor hashes at the corner peaks
brief_extractor.extract(cat_grayscale_equalized, corner_coords)
descriptor_hashes = brief_extractor.descriptors

for i in range(4):
    print(f'Hash {i + 1}: {descriptor_hashes[i]}')

# Compute Hamming similarity
hamming_distances = [np.sum(np.logical_xor(descriptor_hashes[0], descriptor)) for descriptor in descriptor_hashes[1:]]
num_bits = len(descriptor_hashes[0])

plt.hist(hamming_distances, bins=30, range=(0, num_bits), color='skyblue', edgecolor='black')
plt.title('Histogram of Hamming Distances')
plt.xlabel('Hamming Distance')
plt.ylabel('Frequency')
plt.show()
Number of Harris corner peaks found: 604
Hash 1: [False  True False False False False False  True False False  True  True
 False False False False False False  True False False False  True False
 False False False  True False False False False  True False False False
 False False False  True False False False False False False False False
 False False  True  True  True False False  True False False  True False
 False False False False False False  True  True False False False False
 False False  True False False False  True  True  True False False False
 False False  True False False False False  True False  True False  True
 False  True  True  True False False False False False False  True False
 False False False False  True False False  True  True False False False
 False False False False False False False False False False False  True
  True False False False False  True False  True False False False False
 False False  True False False False False  True False  True False False
 False False False False False False  True False False  True False False
 False False False  True False  True False False  True  True False  True
 False  True False False False  True False False False False False  True
  True  True  True False  True False False  True False False False False
 False  True False  True  True  True False  True  True  True False False
 False False False False  True  True  True False False False False False
 False False False False  True False  True False False  True False  True
  True  True  True  True False False  True False False  True  True False
 False  True False False]
Hash 2: [False False  True False False False  True False  True False False False
  True False False False False  True False  True False  True False False
 False False  True False False  True  True  True False False False  True
  True  True  True False  True False  True False False False False  True
 False False False False False  True  True False  True  True False False
  True False False  True  True False False  True False False  True  True
  True  True False  True False  True False  True False False  True  True
 False  True False False  True  True  True  True False  True  True  True
 False False  True False False  True False False False False False False
  True  True False  True False False False False  True  True False  True
  True  True False  True  True False  True  True False  True False False
 False False False False False False  True False False  True False  True
 False False False False False False False False  True False  True False
  True False False False  True  True False  True  True False False False
  True False False False False  True False False False False  True  True
 False False  True False False False False  True False False  True False
 False False False False  True  True  True  True False False False False
 False False False False False False False False False False False False
  True False False False False False  True  True  True False False  True
  True False  True False  True False  True False  True False False False
 False False  True False  True  True False False False False False  True
 False False  True  True]
Hash 3: [False False  True False False False  True  True  True False  True False
 False False False False False  True False  True False  True  True False
 False False False False False  True False  True False False False  True
  True  True  True False False False  True False False False False  True
 False False False False False  True  True  True  True  True False False
 False False False  True  True False False False False False  True  True
  True False False False False  True  True False  True False  True  True
 False  True False False  True  True False False  True False  True False
 False  True False  True  True  True False  True False False  True False
 False False False  True False False False  True False  True False  True
  True  True False  True  True False False  True False  True  True  True
 False False False False False False  True False False  True False  True
 False False False False  True False False False  True False  True False
 False False False  True  True  True False False False  True False  True
  True False False False False  True False False False  True  True  True
 False False  True False  True  True False  True False False False  True
 False  True False False False False False False False  True False False
 False False False False False  True False False  True False  True False
  True False False False False False False  True  True False False  True
  True False  True False False False False False False False  True  True
 False False False  True  True False False False False False  True  True
 False False False  True]
Hash 4: [False  True  True False False False False  True False False  True False
 False False  True False False False  True False False  True  True False
 False False False  True False  True False False  True False False False
 False False False  True False False False False False False False False
 False False  True  True  True False  True  True False False  True False
  True False False False False False  True False False False False  True
 False False  True False False False  True False  True False False  True
 False False  True False  True False False False  True False False False
 False  True False  True  True False False  True False False  True False
 False False False False  True False False  True False  True False False
 False False False  True False  True False False False False  True  True
  True False False False False  True False  True False False False False
 False False  True False  True False False  True  True  True  True False
 False False False  True False  True  True False  True  True False  True
  True False False  True False False False False  True  True False  True
 False False False False  True  True False False False False False  True
  True  True  True False False False False False False  True False False
 False False False  True  True  True False False  True False  True False
 False False False False  True False False False False False False  True
 False False False False False False False False  True  True  True  True
  True False False  True False False  True False False  True  True False
 False  True False False]

Examine the histogram of the hash similarity values. Keeping in mind the maximum and minimum possible values, what statement can you make about how strong the similarity is between these interest points? Try to explain your reasoning.
End of exercise.

Answer: Examining the histogram of the hash similarity values provides insight into the strength of similarity between interest points. Since the histogram above is skewed towards lower values, we can say that many interest points have similar BRIEF descriptor hashes, suggesting a higher degree of similarity among these interest points.

Texture¶

Texture is a ubiquitous feature of nearly all images. However, stating exactly what texture is, and how to measure it, is not simple or straightforward. In general, we can say that texture is a function of local variation of the image. Typical measures used to quantify texture are:

  • Covariance over local patches of the image where regions with higher covariance have rougher textures.
  • Entropy computed over local patches of the image where regions with higher entropy have rougher textures.

Here we will focus on local entropy to measure texture, using the skimage.filters.rank.entropy function.
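
For a patch, the local entropy is the Shannon entropy of the patch's gray-level histogram, which skimage.filters.rank.entropy computes in bits:

$$H = -\sum_k p_k \log_2(p_k)$$

where $p_k$ is the fraction of pixels in the patch with gray level $k$. A flat patch concentrates its pixels in a few levels and has low entropy, while a rough, textured patch spreads over many levels and has high entropy.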

Exercise 3-10: You will now use local entropy as a measure of texture on the equalized gray-scale cat image by doing the following.

  1. Iterate over the disk patch radii, $[3,6,12,24,48,96]$, and for each value do each of the following steps.
  2. Compute the local entropy of the equalized gray-scale cat image, using skimage.morphology.disk with the radius value for the footprint argument. Make sure you convert the image to unsigned 8-bit integers with skimage.util.img_as_ubyte.
  3. Side by side, display the image and a histogram of the entropy values with 50 bins. Include a title with the disk radius and axes labels for the histogram.
In [71]:
disk_radii = [3, 6, 12, 24, 48, 96]

for radius in disk_radii:
    # Compute local entropy
    local_entropy = entropy(util.img_as_ubyte(cat_grayscale_equalized), morphology.disk(radius))
    fig, axes = plt.subplots(1, 2, figsize=(12, 6))
    
    # Display image
    axes[0].imshow(local_entropy, cmap='gray')
    axes[0].set_title(f'Local Entropy (Disk Radius = {radius})')
    axes[0].axis('off')
    
    # Display histogram
    axes[1].hist(local_entropy.ravel(), bins=50, color='skyblue', edgecolor='black')
    axes[1].set_title('Histogram of Entropy Values')
    axes[1].set_xlabel('Entropy Value')
    axes[1].set_ylabel('Frequency')
    
    plt.show()


Examine these results and answer the following questions:

  1. How can you describe the change in the image with operator diameter, and what does this tell you about the scale of the features?
  2. At what scale does the histogram exhibit distinctly multi-modal behavior, and what does this mean in terms of possibly segmenting the image into regions with common properties?
    End of exercise.

Answers:

  1. As the operator size increases, the features defined by entropy become larger and more continuous looking. This behavior corresponds to increasing feature scale with the octaves of the operator radius.
  2. At an operator radius of 48 the histogram shows distinctly bi-modal behavior. Partitioning by these modes allows the image to be divided into several domains or regions.

Local entropy shows the texture properties of an image. There are several uses for these properties in computer vision algorithms; segmentation of an image into regions by texture is one such use.

The code in the cell below segments the image into 3 regions by value of the entropy. Execute the code and examine the result.

In [72]:
radius = 24
threshold1 = 7.8
threshold2 = 7.5
cat_entropy_seg = entropy(util.img_as_ubyte(cat_grayscale_equalized), morphology.disk(radius))
cat_entropy_seg = np.where(cat_entropy_seg > threshold1, 1.0, np.where(cat_entropy_seg > threshold2, 0.5, 0.0))

fig, ax = plt.subplots(1, 2, figsize=(16,5))
ax = ax.flatten()
_=ax[0].imshow(cat_entropy_seg)
_=ax[0].set_title('Segmentation for disk radius = ' + str(radius))
_=ax[1].imshow(cat_grayscale_equalized, cmap=plt.get_cmap('gray'))
_=ax[1].set_title('Original cat image')

Compare the regions found by the segmentation to the original gray-scale image. Notice how the segmentation of the image by texture reasonably matches the large-scale features of the cat image. This segmentation could be used to mask the areas of image of interest, for example.

Copyright 2021, Stephen F Elston. All rights reserved.¶